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Abstract 

It is well recognized that discontinuous analysis increments of sequential data assim- 
ilation systems, such as ensemble Kalman filters, might lead to spurious high frequency 
adjustment processes in the model dynamics. Various methods have been devised to con- 
tinuously spread out the analysis increments over a fixed time interval centered about 
analysis time. Among these techniques are nudging and incremental analysis updates 
(IAU). Here we propose another alternative, which may be viewed as a hybrid of nudging 
and IAU and which arises naturally from a recently proposed continuous formulation of 
the ensemble Kalman analysis step. A new slow- fast extension of the popular Lorenz-96 
model is introduced to demonstrate the properties of the proposed mollified ensemble 
Kalman filter. 



1 Introduction 

Given a model in form of an ordinary differential equation and measurements at discrete in- 
stances in time, data assimilation attempts to find the best possible approximation to the true 



dynamics of the physical system under consideration (Evensen, 2006). Data assimilation by 



sequential filtering techniques achieves such an approximation by discontinuous-in-time adjust- 
ment of the model dynamics due to available measurements. While optimality of the induced 
continuous-discrete filtering process can be shown for linear systems, discontinuities can lead to 
unphysical readjustment processes under the model dynamics for nonlinear systems and under 
imperfect knowledge of the error probability distributions. See, e.g., |Houtekamer and Mitchell 



(2005); Kepert (2009) in the context of operational models as well as Neef et al. (2006); Oke 



et al. (2007) for a study of this phenomena under simple model problems. These observations 



have led to the consideration of data assimilation systems that seek to incorporate data in a 
more "continuous" manner. In this paper we focus on a novel continuous data assimilation 
procedure based on ensemble Kalman filters. Contrary to widely used incremental analysis 



updates (IAU) of Bloom et al. (1996), which first do a complete analysis step to then distribute 
the increments evenly over a given time window, our approach performs the analysis step it- 
self over a fixed window. Our novel filter technique is based on the continuous formulation 



of the Kalman analysis step in terms of ensemble members (pergemann and Reich[ 2010|) and 

The 



mollification of the Dirac delta function by smooth approximations (Friedrichs, 1944). 
proposed mollified ensemble Kalman (MEnK) filter is described in Section |2j The MEnK filter 
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may be viewed as a "sophisticated" form of nudging (Hoke and Anthes, 1976 Macpherson 



1991) with the nudging coefficients being obtained from a Kalman analysis perspective instead 



of heuristic tuning. However, for certain multi-scale systems, nudging with prescribed nudging 
coefficients might still be advantageous (Ballabrera-Poy et al. 2009). We also point to the 



closely related work by Lei and Stauffer (2009), which proposes a nudging-type implementation 



of the ensemble Kalman filter with perturbed observations (Burgers et al. 1998). 



To demonstrate the properties of the new MEnK filter, we propose a slow-fast extension 
of the popular Lorenz-96 model (Lorenz, 1996) in Section 111 Contrary to other multi-scale 



extensions of the Lorenz-96 model, the fast dynamics is entirely conservative in our model 
and encodes a dynamic balance relation similar to geostrophic balance in primitive equation 
models. Our model is designed to show the generation of unbalanced fast oscillations through 
standard sequential ensemble Kalman filter implementations under imperfect knowledge of the 
error probability distributions. Imperfect knowledge of the error probability distribution can 
arise for various reasons. In this paper, we address in particular the aspect of small ensemble 
size and covariance localization (Houtekamer and Mitchell 2001 Hamill et al. , 2001). It is 



also demonstrated that both IAU as well as the newly proposed MEnK filter maintain balance 
under assimilation with small ensembles and covariance localization. However, IAU develops an 
instability in our model system over longer assimilation cycles, which requires a relatively large 
amount of artificial damping in the fast dynamics. We also note that the MEnK filter is cheaper 
to implement than IAU since no complete assimilation cycle and repeated model integration 
need to be performed. It appears that the MEnK filter could provide a useful alternative to the 
widely employed combination of data assimilation and subsequent ensemble re-initialization. 
See, for example, Houtekamer and Mitchell (2005). 



2 Mollified ensemble Kalman filter 

We consider models given in form of ordinary differential equations 

i = /(x,t) (1) 

with state variable xGK™. Initial conditions at time to are not precisely known and we assume 
instead that 

x(t )~N(x ,B), (2) 

where N(x , B) denotes an n-dimensional Gaussian distribution with mean x G M. n and co- 
variance matrix B G IR nxn . We also assume that we obtain measurements y(tj) G MP at 
discrete times tj > to, j '• = 0, 1, . . . , M, subject to measurement errors, which are also Gaussian 
distributed with zero mean and covariance matrix R G lR pxp , i.e. 

y(f i )-Hx(f i )~N(0,R). (3) 

Here H G IR pxn is the (linear) measurement operator. 

Ensemble Kalman filters ( |Evensen , 2006) rely on the simultaneous propagation of m inde- 
pendent solutions Xj(t), i = 1, . . . ,m, of (|lj), which we collect in a matrix X(i) G IR nxm . We 
can extract an empirical mean 



f ft 
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and an empirical covariance matrix 



1 m 

p(t) = — - J2 (*(*) - m) (Mt) - my 

m — 1 z — ' 



(5) 



from the ensemble. In typical applications from meteorology, the ensemble size m is much 
smaller than the dimension n of the model phase space. We now state a complete ensemble 
Kalman filter formulation for sequences of observations at time instances tj, j = 1, . . . , M, and 
intermediate propagation of the ensemble under t he dynamics (jlj). Specifi cally, the continuous 
formulation of the ensemble Kalman filter step by Bergemann and Reich (2010) allows for the 



following concise formulation in terms of a single differential equation 



M 



(6) 



in each ensemble member, where 5(-) denotes the standard Dirac delta function, Vj is the 
potential 



Vi(X 

with observational cost function 

S,(x) = 



&(x) + 



1 m 1 

i=i J 



Hx 



y(i,)) T R" 



[Hx - yfo)) ■ 



(7) 



(8) 



One may view ^ as the original ODE ([I]) driven by a sequence of impulse like contributions 
due to observations. See the Appendix for a derivation of 

It should be noted that (|6| is equivalent to a standard Kalman filter (and hence optimal) for 
linear models and the number of ensemble members m larger than the dimension of phase space 
n. This property does no longer hold under nonlinear model dynamics and it makes sense to 
"mollify" the discontinuous analysis adjustments as, for example, achieved by the popular IAU 



technique (Bloom et al. 1996). In the context of (|6|), a most natural mollification is achieved 
by replacing (|6j) with 



M 



/(x,,t)-^5 e (t-t,)PV Xl V,(X) 



(9) 



3=1 



where 



ip{s) is the standard hat function 



: i>(s/e) 



for \s\ 
else 



< 1, 



(10) 



and e > is an appropriate parameter. The hat function (10) could, of course, be replaced 
by another B-spline. We note that the term mollification was introduced by Friedrichs (1944) 



to denote families of compactly supported smooth functions 5 e which approach the Dirac delta 
function 5 in the limit e — > 0. Mollification via convolution turns non-standard functions 
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(distributions) into smooth functions. Here we relax the smoothness assumption and allow for 
any non-negative, compactly supported family of functions that can be used to approximate 
the Dirac delta function. A related mollification approach to numerical time-stepping methods 
has been proposed in the context of multi-rate splitting methods for highly oscillatory ordinary 
differential equations in |Garcia-Archilla et al. (1998); Izaguirre et al. (1999). 

Fo rmulations (|6~|) a nd ( 9 ) arc based on square-root filter formulations of the ensemble Kalman 
filter (Tippett et al. 2003). Ensemble inflation (Anderson and Anderson, 1999) is a popular 
techniques to stabilize the performance of such filters. We note that ensemble inflation can 
be interpreted as adding a 9 (x; — x) term to the right hand side of Q for an appropriate 
parameter value 9 > 0. We note that a related "destabilizing" term appears in continuous 
filter formulations for linear systems. See, for example, Simon (2006) for details. The link to 
Hqo filtering suggests alternative forms of ensemble inflation and a systemtic approach to make 
ensemble Kalman filters more robust. Details will be explored in a forthcoming publication. 



Localization (Houtekamer and Mitchell, 2001; Hamill et al. 2001) is another popular tech- 



nique to enhance the performance of ensemble Kalman filters. Localization can easily be imple- 
mented into the mollified formulation ^ and leads to a modified covariance matrix P = C o P 
in g. Here C is an appropriate localization matrix and C o P denotes the Schur product of 
C and P. See Bergemann and Reich (2010) for details. 

We finally note that both (|6j) and ^ can be modified to become consistent with an ensemble 



Kalman filter with perturbed observations (Burgers et al., 1998). This requires to adjust the 



potentials V,- and to add a stochastic forcing term. 



3 Algorithmic summary 

Formulation ^ can be solved numerically by any standard ODE solver. Similar to nudging, 
there is no longer a strict separation between ensemble propagation and filtering. However, 
based on (|9]), "nudging" is performed consistently with Kalman filtering in the limit e — > 0. An 
optimal choice of e will depend on the specific applications. 

Using, for example, the implicit midpoint method as the time-stepping method for the 
dynamic part of ([9j and denoting numerical approximations at = kAt by xf, we obtain the 
following algorithm: 

x*+ 1 = x* + At |/(xf +1/2 ,t fc+1/2 ) - |>*P fc V Xl V,(X fc ) j , (11) 

for i = 1, . . . , m. Here X fc is the collection of ensemble approximations at tk, P k is the resulting 
covariance matrix, x. k+1 ^ 2 = (x^ +1 + x^)/2 is the midpoint approximation, and the weights 
a k > are defined by 

a) = U ( tfc "f tobs ) (12) 
with the hat function (10) and normalization constant c > chosen such that 

AtJ2 a j = 1 - ( 13 ) 



Note that the summation in (11) needs to be performed only over those (measurement) indices 

k 
J 



j with a k 7^ 0. 
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For the numerical experiments conducted in this paper, we set e = At ^ s /2 and assume 
that At bs is a multiple of the time-step size At. Under this setting only a single measurment 
y(tj) is assimilated at any given time-step Another interesting choice is e = At D b s which 
implies that . a* is independent of the time-step index k. In this case, two measurements are 
simultaneously assimilated (with different weights) at any given time-step tk- 

The main additional cost of (11) over standard implementations of ensemble Kalman filters 
is the computation of the covariance matrix P in each time-step. However, this cost is relatively 
minor compared to the cost involved in propagating the ensemble under the model equations 
(fl]) in the context of meteorological applications. We note that, contrary to IAU and the recent 
work by Lei and Stauffer (2009), no additional model propagation steps are required. 



4 A slow-fast Lorenz-96 model 



We start from the standard Lorenz-96 model (Lorenz, 1996 Lorenz and Emanuel, 1998) 

Xi = (X l+ i - £;_ 2 )XZ-1 - Xi + 8 



(14) 



for I — 1, . . . , 40 with periodic boundary conditions, i.e., X-\ = 239, xq = £40, and £41 = x\. 
We note that the energy 
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-^Lorenz „ ^ %l 



1=1 

is preserved under the "advective" contribution 

&l — \ x l+l ~ %l-2)%l-l = Xi^\Xi + i — ££-2^-1 

to the Lorenz-96 model. 



(15) 



(16) 



We now derive a slow-fast extension of (14). To do so we introduce an additional variable 



h, which satisfies a discrete wave equation, i.e. 



e 2 hi = -hi + a 2 [h i+1 - 2h x + h 



1-1 



(17) 



/ = 1, . . . ,40, where e > is a small parameter implying that the time evolution in h is fast 
compared to that of x in the Lorenz model (14) and a > is a parameter determining the 
wave dispersion over the computational grid. We assume again periodic boundary conditions 
and find that (17) conserves the energy 

_2 4 



2 40 -^40 



(18) 



1=1 



1=1 



The two models (14) and (17) are now coupled together by introducing an exchange energy 
term 
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E, 



coupling 



-S^hjXi, 



(19) 



1=1 



where 5 > characterizes the coupling strength. The equations of motion for the combined 
system are then defined as 



xi = (1 - S) (xi+x - xi- 2 ) xi-! + S (xi-ihi+i - xi- 2 hi-i) - xi + 
e 2 'hi = -h l + a 2 [h l+1 -2hi + h l - 1 ]+xi, 



(20) 
(21) 



5 



non-balanced dynamics 




time 



Figure 1: Norm of (27) over all grid points for several values of e as a function of integration 
time. It can be concluded that balance is maintained to high accuracy for sufficiently small 
values of e except for perturbations induced at initial time. The amplitude of the oscillations 
could be reduced further by more sophisticated initialization techniques (higher order balance 
conditions). 



typically x and h fields 




Figure 2: Typical balanced {x{\ and {hi} fields for the slow-fast Lorenz model. It can be seen 



that the balanced {hi} is more regular than the {xi} field due to the balance relation (26). 
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where we have scaled the advective contribution in ( 14 ) by a factor of 1 — 5 to counterbalance 



the additional contribution from the coupling term. We note that the pure wave-advection 
system 



8% 



(1 - 5) (xi+i - £2-2) xi-! + 5 (xi-!hi+i - xi- 2 hi-!) 
-hi + a 2 [h l+ i - 2hi + h^] + x t 



conserves the total energy 



H 



((5 l)-£'Loronz -^coupling 

6 
2 



5 40 (5-1 

' J2 \ -^ %2 i + e2 ' h2 i +h 2 + o? (h l+1 - h^ 2 - 2xih 



=1 
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independent of the choice of 5 G [0, 1]. We also wish to point to the balance relation 

x t = hi- a 2 [hi +1 - 2hi + hi- X \ 



(22) 
(23) 



(24) 
(25) 

(26) 



in the wave part (21). More specifically, if (26) is satisfied at initial time, the variables {hi} 
and {xi} will remain in approximate balance over a fixed time interval with the deviation from 



exact balance proportional to e (Wirosoetisno and Shepherd, 2000 Wirosoetisno 2004; Cotter 
and Reich , 2006 ) . We demonstrate conservation of balance for decreasing values of e in Fig. |l| 
where we display the Euclidean norm of 



A t = xi - hi + a 2 [hi +1 - 2hi + h t _i] 



(27) 



1 = 1,..., 40, as a function of time. Hence, we may view (26) as defining an approximative slow 
manifold on which the dynamics of (20 )-(21 ) essentially reduces to the dynamics of the standard 
Lorenz-96 model for sufficiently small e. We use e = 0.0025 for our numerical experiments. 
On the other hand, any imbalance introduced through a data assimilation system will remain 
in the system since there is no damping term in (21). This is in contrast to other slow-fast 



extensions of the Lorenz-96 model, such as Lorenz 



(1996), which also introduce damping and 



forcing into the fast degrees of freedom. We will find in Section [5] that some ensemble Kalman 
filter implementations require the addition of an "artificial" damping term into (21) and we 
will use 

e 2 hi = -hi + a 2 [h l+1 - 2h t + + x t - ^e 2 hi (28) 
for those filters with an appropriately chosen parameter 7 > 0. 



We mention that (20 )-(21 ) may be viewed as an one-dimensional "wave-advection" extension 



of the Lorenz-96 model under a quasi-geostrophic scaling regime with e being proportional to the 
Rossby number and Bu = a 2 being the Burger number (Salmon, 1999). In a quasi-geostrophic 



regime, the Burger number should be of order one and we set a = 1/2 in our experiments. Since 
a typical time-scale of the Lorenz-96 model is about 0.025 units, we derive at an equivalent 
Rossby number of Ro ~ 0.1 for e = 0.0025. Typical {hi} and {xi} fields are displayed in Fig. [2j 
It should, of course, be kept in mind that the Lorenz-96 model is not the discretization of a 
realistic fluid model. 

The dynamics of our extended model depends on the coupling strength 5 G [0,1]. Since 
the {hi} field is more regular than the {xi} field, stronger coupling leads to a more regular 
dynamics in terms of spatial correlation. We compute the grid- value mean x = (x{) and its 
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Figure 3: Best RMS error for slow-fast Lorenz-96 model with coupling strength 5 = 0.1 using 
ensembles of size m — 10 and k = 20 observations of {xj} taken in intervals of At t, s = 0.05 over 
a total of 4000 assimilation cycles. We compare a standard ensemble Kalman filter (EnKF) 
implementation to the MEnK filter and a EnKF with IAU. We also implement a standard 
EnKF for the damped wave equation (28) with damping factor 7 = 0.1. 



variance a = ((xi — x) 2 ) 1 ^ 2 along a long reference trajectory for 5 = 0.1 (x = 2.32, a = 3.68), 
5 = 0.5 {x = 1.80, a = 3.67), and 5 = 1.0 {x = 1.48, a = 3.69), respectively. The variance is 
nearly identical for all three parameter values while the mean decreases for larger values of S, 
which implies that the dynamics becomes less "non-linear" for larger values of 5. 



5 Numerical experiments 



We run the system for a coupling strengths of S = 0.1 and 5 = 0.5, respectively, using either 
a (standard) ensemble Kalman filter implementation based on ^ or an implementation of the 
mollified formulation ([9]). We also recall that the mollified formulation Q uses the mollifier 
([10]) with e = A obs /2. 

We observe {x{\ at every second grid point in time-intervals of At b s = 0.05 with mea- 
surement error variance R = l2o- The time-step for the numerical time-stepping method 



(a second-order in time, time-symmetric splitting method (Leimkuhler and Reich, 2005)) is 
At = At obs /20 = 0.0025. 



use localization ( 


Houtekamer and Mitchell 


2001 


(Anderson and Anderson 


1999 


). Localization is 



multiplying each element of the covariance matrix P by a distance dependent factor p^j/. This 



factor is defined by the compactly supported localization function (4.10) from Gaspari and 
Cohn (1999), distance r^/ = min{|/ — /'|,40 — \l — l'\}, where / and /' denote the indices of the 
associated observation/grid points x\ and xy, respectively, and a fixed localization radius r . 



See Bergemann and Reich (2010) for more implementation details. Inflation is applied after 



each time-step to the {x{\ components of the ensemble members. 
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Figure 4: Unbalanced dynamics generated through the data assimilation steps in a standard 
ensemble Kalman filter (EnKF) implementation as compared to the relatively low and nearly 
constant unbalanced wave activities under the MEnK filter and the IAU EnKF implementation 
over 500 assimilation steps and for coupling constant 5 = 0.1 and localization radius r = 2. 



The initial ensemble is generated by adding small random perturbations to a reference {x{\ 



field. The associated {hi} ensemble fields are computed from the balance relation (26). After 
a spin-up period of 200 assimilation cycles, we do a total of 4000 assimilation cycles in each 
simulation run and compute the RMS error between the analyzed predictions and the true 
reference solution from which the measurements were generated. RMS errors are computed 
over a wide range of inflation factors and only the optimal result for a given localization radius 
is reported. 

We first compare the performance of a standard ensemble Kalman filter implementation 
with the proposed MEnK filter for a coupling strength 5 = 0.1. It can be seen from Fig. [3] 
that the RMS error in the {hi} fields is much larger for the standard ensemble Kalman filter 
than for the mollified scheme. The difference is most pronounced for small localization radii. 
Indeed, it is found that the standard filter generates significant amounts of unbalanced waves. 
See Fig. |4j where we display the Euclidean norm of (27) over all ensemble members for the 
first 500 assimilation steps and localization radius ro = 2. Note that A; = at t = and that 



no damping is applied to the wave part (21). We also implement a standard ensemble Kalman 
filter for the damped wave equation (28) with 7 = 0.1 which leads to a significant reduction 
(but not complete elimination) in the RMS errors for the {hi} fields. Qualitatively the same 
results are obtained for the stronger coupling 5 = 0.5, see Fig. [5j 



For comparison we implement an IAU along the lines of Bloom et al. (1996); Polavarapu et al. 



(2004) with the analysis increments provided by a standard ensemble Kalman filter. Instead of 



a constant forcing, we use the hat function (10) for computing the weights g(t) in Bloom et al. 



(1996) to be consistent with the implementation of the MEnK filter. While it is found that the 



resulting IAU ensemble Kalman filter conserves balance well initially (sec Fig.jlJ, the IAU filter 
eventually becomes highly inaccurate and/or unstable. We believe that this failure of IAU over 
longer analysis cycles (here 4000) is due to an instability of the algorithm in the fast wave part 
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Figure 5: Same as Fig. [3] except that the coupling strength is set to 5 = 0.5. 



(21 ). The instability could be caused by the non-symmetric-in-time nature of the IAU process, 
i.e., by the fact that one first computes a forecast, then finds the analysis increments based 
on the forecast and available observations, and finally repeats the forecast with assimilation 



increments included. It should also be kept in mind that the IAU of Bloom et al. ( 1996 )) is based 
on a standard 3D Var analysis for given background covariance matrix, which is significantly 
different from an IAU ensemble Kalman filter implementation. A stable implementation of an 



IAU is achieved by replacing (21) by the damped version (28) with 7 = 1.0. Smaller values of 



7 lead to filter divergence. With damping included, the IAU filter behaves almost identical to 



the MEnK filter and the standard ensemble Kalman filter with damping in the wave part (28), 
see Figs. |3]and[5j 

We also tested the dependence on the type of measurements and found that observation of 
the mixed {(hi + xi)/2} field instead of {xi} leads to results which are in qualitative agreement 
with those displayed in Fig. [3] for all three methods, see Fig. [6} 

In summary, we find that localization interferes with dynamic balance relations in standard 
ensemble Kalman filter implementations. This result has been well established by a number of 
previous studies on models of varying complexity. Here we have demonstrated for a relatively 
simply model system that strong localization can be used without the need of additional damp- 
ing and/or re-initialization provided the assimilation increments are distributed evenly over a 
time-window instead of being assimilated instantaneously. It appears that the proposed MEnK 
filter works particularly reliably. It should be kept in mind that the MEnK filter leads to ad- 
ditional costs per model integration step. But these costs should be relatively small compared 
to those of the model dynamics itself. 



6 Conclusion 

In this paper, we have addressed a shortcoming of sequential data assimilation systems which 
arises from the discontinuous nature of the assimilation increments and the imperfect statis- 
tics under ensemble based filtering techniques. In line with the previously considered IAU 
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slow-fast system with coupling 6 = 0.1 




4 6 8 10 12 14 

localization radius 



Figure 6: Same as Fig. [3] except that observations of {(xj + hj)/2} are taken in intervals of 
Atobs = 0.05 with measurement error variance R = I20 over a total of 4000 assimilation cycles. 



and nudging techniques, we have proposed spreading the analysis through a mollified Dirac 
delta function. Mollification has been used before to avoid numerical instabilities in multi-rate 
time-stepping methods for the integration of highly oscillatory differential equations (see, for 
example, Garcia-Archilla et al. (1998); Izaguirre et al. (1999)). We have demonstrated for a 
simple slow-fast extension of the Lorenz-96 model that the proposed MEnK filter indeed elim- 
inates high-frequency responses in the dynamic model. The MEnK filter can be viewed as a 
nudging technique with the nudging coefficients determined consistently from ensemble predic- 
tions. We note in this context that a combined ensemble Kalman filter and nudging technique 
was found beneficial by Ballabrera-Poy et al. (2009) for the original slow-fast Lorenz-96 model 
(Lorenz, 1996). The MEnK filter (perhaps also combined with the filtering-nudging approach of 



Ballabrera-Poy et al. (2009)) may provide an alternative to currently used combinations of en- 
semble Kalman filters and subsequent re-initialization. For the current model study it is found 
that artificial damping of the unbalanced high-frequency waves stabilizes the performance of a 
standard ensemble Kalman filter and leads to results similar to those observed for the MEnK 



filter. We also found, in line with previous studies of Houtekamer and Mitchell (2005); Oke 



et al. (2007); Kepert (2009), that stronger localization leads to a more pronounced generation 



of unbalanced waves under a standard ensemble Kalman filter. 

Speaking on a more abstract level, the MEnK filter also provides a step towards a more 
integrated view on dynamics and data assimilation. The MEnK filter is closer to the basic 
idea of filtering in that regard than the current practice of batched assimilation in intervals of 
6 to 9 hours, which is more in line with four dimensional variational data assimilation (which 
is a smoothing technique). The close relation between our mollified ensemble Kalman filter, 
continuous time Kalman filtering and nudging should also allow for an alternative approach to 
the assimilation of non-synoptic measurements (Evensen, 2006). 

An IAU ensemble Kalman filter has also been implemented. It is observed that the filter 
eventually becomes unstable and/or inaccurate unless a damping factor of 7 = 1.0 is applied 
to the fast wave part of the modified Lorenz-96 model. We did not attempt to optimize the 
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performance of the IAU assimilation approach. 

(see 



It is feasible that optimally chosen weights 



Polavarapu et al. (2004) for a detailed discussion) could lead to a stable and accurate 
performance similar to the proposed MEnK filter. It should be noted that the MEnK filter 
requires less model integration steps than the IAU ensemble Kalman filter. The same statement 
applies to the recent work of Lei and Stauffer (2009). Indeed, the work of Lei and Stauffer 



(2009) is of particular interest since it demonstrates the benefit of distributed assimilation in 
the context of the more realistic shallow-water equations. We would also like to emphasize 



that IAU ensemble Kalman filtering, the hybrid approach of Lei and Stauffer (2009) and our 



MEnKF filter have in common that they point towards a more "continuous" assimilation of 
data. Such techniques might in the future complement the current operational practice of 
batched assimilation in intervals of, for example, 6 hours at the Canadian Meteorological Centre 



Houtekamer and Mitchell (2005) 



We finally mention that an alternative approach to reducing the generation of unbalanced 
dynamics is to make covariance localization respect balance. This is the approach advocated, 



for example, by Kepert (2009). 
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Appendix 

We have stated in iBergemann and Reich (2010) without explicit proof that an ensemble Kalman 



filter analysis step at observation time tj is equivalent to 

= Xi (t7) - f PV Xi V,-(X) ds, 
Jo 



(29) 



where Xj(^- ) denotes the forecast values an d x^(t^) is the analyzed state of the zth ensemble 
member. The reader was instead referred to Simon] (2006). 



Before we demonstrate how (29) gives rise to rt6), we provide a derivation of (29) for the 



simplified situation of n — 1, k — 1, i.e. the covariance matrices P and R are scalar quantities 
and the observation operator H is equal to one. Under these assumptions the standard Kalman 
analysis step gives rise to 

P f R 



Pn 



and 



P f + R 
XfR + yPf 

Pf + R 



(30) 
(31) 



for a given observation value y. 

We now demonstrate that this update is equivalent to twice the application of a Kalman 
analysis step with R replaced by 2R. Specifically, we obtain 



PL 



2P m R 
P m + 2R } 



P f + 2R 



(32) 



for the resulting covariance matrix P' a with intermediate value P m . The analyzed mean x' a is 
provided by 

2x m R + yP m = 2x f R + yP f 



x„ 



Pm + 2R 



J, 



P f + 2R 

We need to demonstrate that P a = P' a and x a = x' a . We start with the covariance matrix and 
obtain 



PL 



4P f R 

pJTm 



R 



AP f R 2 



P f R 



2P f R 
P f +2R 



+ 2 R 4P 7 i? + AR 2 Pf + R 



Pn 



(34) 
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A similar calculation for x' a yields 

2x f R+yP f D . 2P f R 

¥ _ 2 ^Tm LR + ypjTm _ 4x^ 2 + AyP f R _ x f R + yP f _ 

2R+^ 4Ri + 4RP f P f + R W 

Hence, by induction, we can replace the standard Kalman analysis step by K > 2 iterative 
applications of a Kalman analysis with R replaced by KR. We set P = Pf, x = Xf and 
perform 

KP k R _ Kx k R + yP k 

Pk+l ~ P k + KR' Xk+1 ~ P k + KR (36) 

for k = 0, . . . ,K — 1. We finally set P a = Pk and x a = x K . Next we introduce a step-size 
As = 1/K and assume K 1. Then 

x k+ i = * k v^Z Pk = - A^i?" 1 (x k -y) + 0(As 2 ) (37) 



as well as 



= 5 P 5 = P * " ^P k R^P k + 0(As 2 ). (38) 
Taking the limit As — > 0, we obtain the two differential equations 

HP Hr 

^- = -PRT l P, ^ = -PR- 1 (x - y) (39) 
ds ds \ y) v v 

for the covariance and mean, respectively. The equation for P can be rewritten in terms of its 
square root Y (i.e. P = Y 2 ) as 

£ " -5 pirlK (40) 

We now consider an ensemble consisting of two members sci and x-x- Then x = (xi + x%)/2 and 

P = - x 2 ) 2 , r = _L(. Tl -a; 2 ). (41) 



Hence 



^-(x l + x 2 ) = -PR- l (x 1 + x 2 -2y), ^-{x 1 -x 2 ) = - l -PR-\x l -x 2 ), (42) 
ds ds 2 

which gives rise to 

(IT • I 

^ = --Pi?- 1 (x, + x-2y) (43) 

for i = 1,2. Upon integrating these differential equations from s = with initial conditions 
Xi(0) = Xi(t~), i = 1,2, to s = 1 and setting = Xj(l) we have derived a particular 

instance of the general continuous Kalman formulation (29). 

The general, non-scalar case can be derived from a multiple application of Bayes' theorem 
as follows. We denote the Gaussian prior probability distribution function by 7T/(x) and the 
analyzed Gaussian probability density function by vr a (x). Then 

7r a (x) oc 7r/(x) x exp(-Sj(x)) (44) 



15 



with Sj{x) defined by Bayes' formula (44) can be reformulated to 



7T a (x) OC 7T/(x) X J^exp (- — S^xj , 

k=l \ ' 



(45) 



which gives rise to incremental updates 

7T fe+ i(x) OC 7T fe (x) X exp 



(46) 



for k — 0, . . . , K — 1 with ttq = ivf. The incremental update in the probability density functions 
7Tfe gives rise to a corresponding incremental Kalman update step, which leads to the desired 
continuous formulations in the limit K — > oo. 

Having shown that an ensemble Kalman filter analysis step is equivalent to (29), we next 
demonstrate that ^ is the appropriate mathematical formulation for the complete data as- 
similation scheme. To do so we replace the Dirac delta function 5(t — tj) in ^ by a function 
that takes the constant value 1/e on the interval [tj — e/2,tj + e/2] and zero elsewhere. This 
function approaches the Dirac delta function as e — > 0. Then we obtain from (|6| 



Xifc - e/2 + 5) = Xifo - e/2) + f 

Jo 



: PV Xl V,(X) 



dr 



(47) 



for S e [0, e] , where we have assumed for simplicity that the ODE ([T| is autonomous. We now 
rescale integration time to s — r/e, which leads to 



x .( t . _ e / 2 + e A) = Xifa - e/2) + / [e/(x 4 ) - PV Xl V,(X)] ds, 

Jo 



(48) 



with A G [0, 1]. We arrive at fl29~|) in the limit e — > and A = 1. 
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